Prediction of Carbohydrate Binding Sites on Protein Surfaces with 3-Dimensional Probability Density Distributions of Interacting Atoms
Abstract
Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Predicting non-covalent carbohydrate binding sites on protein surfaces provides insight into the functions of the query proteins, and information on key carbohydrate-binding residues can suggest site-directed mutagenesis experiments, inform the design of therapeutics targeting carbohydrate-binding proteins, and guide the engineering of protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The predictions were based on a novel encoding scheme for the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the interacting-atom distribution patterns characteristic of carbohydrate binding sites in known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence levels. The predictors were benchmarked by 10-fold cross-validation on 497 non-redundant proteins with known carbohydrate binding sites and were further tested on an independent test set of 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (recall) of 0.45 and 0.49, respectively. In addition, the trained predictors were applied to 111 unbound carbohydrate-binding protein structures, that is, structures determined in the absence of the carbohydrate ligands; the overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site prediction to date.
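The residue-based precision, sensitivity (recall), and MCC quoted in the abstract follow the standard confusion-matrix definitions. The sketch below is only an illustration of those definitions, not the authors' evaluation code: the function name `residue_metrics`, the boolean per-residue labelling, and the toy input are assumptions made for this example.

```python
from math import sqrt


def residue_metrics(predicted, actual):
    """Return (precision, recall, mcc) for per-residue binary labels.

    `predicted` and `actual` are equal-length sequences in which a truthy
    value marks a residue as carbohydrate-binding (hypothetical encoding).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = [(bool(p), bool(a)) for p, a in zip(predicted, actual)]
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives

    # Guard against empty denominators; 0.0 is a convention, not a rule.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


if __name__ == "__main__":
    # Toy example with 8 residues; values are made up for illustration.
    pred = [1, 1, 0, 0, 1, 0, 0, 0]
    true = [1, 0, 0, 0, 1, 1, 0, 0]
    print(residue_metrics(pred, true))
```

Because binding residues are usually a small minority of surface residues, MCC is a more informative summary than raw accuracy: a predictor that labels every residue as non-binding scores high accuracy but an MCC of 0.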
Related articles
Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms
Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from ...
Protein-Protein Interaction Site Predictions with Three-Dimensional Probability Distributions of Interacting Atoms on Protein Surfaces
Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core r...
Bioinformatics prediction and experimental validation of VH antibody fragment interacting with Neisseria meningitidis factor H binding protein
Objective(s): We previously conducted an in silico research on the interactions between the ribosome display-selected single chain variable fragment (scFv) and factor H binding protein (fHbp) of Neisseria meningitidis. We found that heavy chain variable (VH) fragment of this scFv had considerable affinity to fHbp. These results led us to evaluate the ability of this sm...
Identification of RNA-binding sites in artemin based on docking energy landscapes and molecular dynamics simulation
There are questions concerning the functions of artemin, an abundant stress protein found in Artemia during embryo development. It has been reported that artemin binds RNA at high temperatures in vitro, suggesting an RNA protective role. In this study, we investigated the possibility of the presence of RNA-binding sites and their structural properties in artemin, using docking energy ...
Adenine molecule interacting with golden nanocluster: A dispersion corrected DFT study
The interaction between nanoparticles and biomolecules such as protein and DNA is one of the major directions of nanobiotechnology research. In this study, we have explored the interaction of adenine nucleic base with a representative golden cluster (Au13) by using dispersion corrected density functional theory (DFT-D3) within GGA-PBE model of theory. Various active sites ...